Results 1 - 2 of 2
1.
Proc. 47th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), pp. 8482-8486, May 2022.
Article in English | Scopus | ID: covidwho-1891390

ABSTRACT

COVID-19 is a respiratory disorder that can disrupt lung function. The effects of a dysfunctional respiratory mechanism can also be reflected in other modalities that operate in close coupling with respiration. Audio signals result from the modulation of respiration by the speech production system, and hence acoustic information can be modeled for the detection of COVID-19. In that direction, this paper addresses the second DiCOVA challenge, which deals with COVID-19 detection based on speech, cough, and breathing. We investigate the modeling of (a) ComParE low-level descriptor (LLD) representations derived at frame- and turn-level resolutions and (b) neural representations obtained from pre-trained neural networks trained to recognize phones and to estimate breathing patterns. On Track 1, the ComParE LLD representations yield a best performance of 78.05% area under the curve (AUC). Experimental studies on Track 2 and Track 3 demonstrate that neural representations tend to yield better detection than the ComParE LLD representations. Late fusion of different utterance-level representations of neural embeddings yielded a best performance of 80.64% AUC. © 2022 IEEE

2.
Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH 2020), pp. 2182-2186, October 2020.
Article in English | Scopus | ID: covidwho-1005298

ABSTRACT

In light of the current COVID-19 pandemic, the need for remote digital health assessment tools is greater than ever. This is especially pertinent for elderly and vulnerable populations. In this regard, the INTERSPEECH 2020 Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) Challenge offers competitors the opportunity to develop speech- and language-based systems for the task of Alzheimer's Dementia (AD) recognition. The challenge data consist of speech recordings and their transcripts; the work presented herein assesses different contemporary approaches on these modalities. Specifically, we compared a hierarchical neural network with an attention mechanism trained on linguistic features against three acoustic systems: (i) Bag-of-Audio-Words (BoAW) quantising different low-level descriptors, (ii) a Siamese network trained on log-Mel spectrograms, and (iii) an end-to-end Convolutional Neural Network (CNN) system trained on raw waveforms. Key results indicate the strength of the linguistic approach over the acoustic systems. Our strongest test-set result was achieved using a late-fusion combination of the BoAW, end-to-end CNN, and hierarchical-attention networks, which outperformed the challenge baseline in both the classification and regression tasks. Copyright © 2020 ISCA
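The Bag-of-Audio-Words step in (i) quantises each frame-level descriptor against a codebook and pools the assignments into one utterance-level histogram. The following is a minimal sketch, not the challenge submission code: the 2-D frames and three-word codebook are invented toy values, whereas a real BoAW system quantises high-dimensional LLD frames against a codebook learned from training data (e.g. by k-means or random sampling):

```python
def bag_of_audio_words(frames, codebook):
    """Assign each frame-level descriptor to its nearest codeword
    (squared Euclidean distance) and return a normalised histogram,
    which serves as the utterance-level BoAW feature vector."""
    hist = [0] * len(codebook)
    for frame in frames:
        dists = [sum((f - c) ** 2 for f, c in zip(frame, word))
                 for word in codebook]
        hist[dists.index(min(dists))] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

# Toy 2-D descriptors and a 3-word codebook (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
frames = [[0.1, 0.1], [0.9, 1.1], [1.9, 0.2], [0.0, 0.2]]
print(bag_of_audio_words(frames, codebook))  # → [0.5, 0.25, 0.25]
```

The resulting fixed-length histogram can then be fed to any standard classifier, which is what makes BoAW convenient for variable-length audio.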
